gh-145264: Do not ignore excess Base64 data after the first padded quad #145267
serhiy-storchaka merged 4 commits into python:main
…ded quad Base64 decoder (see binascii.a2b_base64(), base64.b64decode(), etc) no longer ignores excess data after the first padded quad in non-strict (default) mode. Instead, in conformance with RFC 4648, it ignores the pad character, "=", if it is present before the end of the encoded data.
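For illustration only, the rule described above can be sketched in pure Python. This is not the C implementation touched by this PR: `lenient_b64decode` and `B64_ALPHABET` are hypothetical names, and the sketch assumes that "ignoring the pad character" means dropping every "=" (together with any other non-alphabet byte) before decoding.

```python
import base64
import string

# Standard Base64 alphabet from RFC 4648; "=" is deliberately not included.
B64_ALPHABET = (string.ascii_letters + string.digits + "+/").encode()


def lenient_b64decode(data: bytes) -> bytes:
    """Hypothetical helper: decode while ignoring "=" and other non-alphabet bytes."""
    # Keep only alphabet characters, so a pad character in the middle of
    # the input no longer ends decoding.
    kept = bytes(c for c in data if c in B64_ALPHABET)
    # Restore the padding that the stdlib decoder expects at the very end.
    kept += b"=" * (-len(kept) % 4)
    return base64.b64decode(kept)


print(lenient_b64decode(b"ab==cd"))  # data after the padded quad is decoded too
print(lenient_b64decode(b"YQ=="))    # ordinary padded input is unaffected
```

Under this assumption, `b"ab==cd"` decodes the same way as `b"abcd"` instead of stopping after the first padded quad.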
Force-pushed from 50967e0 to 0229b06.
@@ -0,0 +1,4 @@
Base64 decoder (see :func:`binascii.a2b_base64`, :func:`base64.b64decode`, etc) no
longer ignores excess data after the first padded quad in non-strict
(default) mode. Instead, in conformance with :rfc:`4648`, it ignores
I guess this is in accordance with the MAY in https://datatracker.ietf.org/doc/html/rfc4648#section-3.3 about ignoring pad characters as non-alphabet data? It'd be good to cite the specific section.
Lib/test/test_binascii.py
Outdated
# Test excess data exceptions
def assertExcessData(data, non_strict_expected,
                     ignore_padchar_expected=None):
def assertExcessData(data, non_strict_expected):
Rename this from non_strict_expected to just expected.
In strict mode you get an error. You get that value only in non-strict mode, either when strict_mode=False, or when ignorechars contains "=".
But I agree that expected is shorter. The old name was even longer: non_strict_mode_expected_result.
@gpshead, could you please look again at that PR?
Is it fine for you, @gpshead?
Thanks @serhiy-storchaka for the PR 🌮🎉. I'm working now to backport this PR to: 3.13, 3.14.
Sorry, @serhiy-storchaka, I could not cleanly backport this to
Sorry, @serhiy-storchaka, I could not cleanly backport this to
…rst padded quad (pythonGH-145267)
Base64 decoder (see binascii.a2b_base64(), base64.b64decode(), etc) no longer ignores excess data after the first padded quad in non-strict (default) mode. Instead, in conformance with RFC 4648, it ignores the pad character, "=", if it is present before the end of the encoded data.
(cherry picked from commit 4561f64)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
GH-146326 is a backport of this pull request to the 3.14 branch.
…8577
* 'main' of github.com:python/cpython:
  pythongh-146197: Run -m test.pythoninfo on the Emscripten CI (python#146332)
  pythongh-146325: Use `test.support.requires_fork` in test_fastpath_cache_cleared_in_forked_child (python#146330)
  pythongh-146197: Add Emscripten to CI (python#146198)
  pythongh-143387: Raise an exception instead of returning None when metadata file is missing. (python#146234)
  pythongh-108907: ctypes: Document _type_ codes (pythonGH-145837)
  pythongh-146175: Soft-deprecate outdated macros; convert internal usage (pythonGH-146178)
  pythongh-146056: Rework ref counting in treebuilder_handle_end() (python#146167)
  Add a warning about untrusted input to `configparser` docs (python#146276)
  pythongh-145264: Do not ignore excess Base64 data after the first padded quad (pythonGH-145267)
  pythongh-146308: Fix error handling issues in _remote_debugging module (python#146309)
  pythongh-146192: Add base32 support to binascii (pythonGH-146193)
  pythongh-135953: Properly obtain main thread identifier in Gecko Collector (python#146045)
  pythongh-143414: Implement unique reference tracking for JIT, optimize unpacking of such tuples (pythonGH-144300)
  pythongh-146261: Fix bug in `_Py_uop_sym_set_func_version` (pythonGH-146291)
  pythongh-145144: Add more tests for UserList, UserDict, etc (pythonGH-145145)
  pythongh-143959: Fix test_datetime if _datetime is unavailable (pythonGH-145248)
  pythongh-146245: Fix reference and buffer leaks via audit hook in socket module (pythonGH-146248)
  pythongh-140049: Colorize exception notes in `traceback.py` (python#140051)
  Update docs for pythongh-146056 (pythonGH-146213)